Deriving Morphological Analyzers from Example Inflections
Abstract
This paper presents a semi-automatic method for deriving morphological analyzers from a limited number of example inflections; the method is suitable for languages with alphabetic writing systems. The system learns the inflectional behavior of morphological paradigms from examples and converts the learned paradigms into a finite-state transducer that maps inflected forms of previously unseen words to lemmas and their corresponding morphosyntactic descriptions. We evaluate the system on inflection tables for several languages collected from Wiktionary.
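The idea of learning a paradigm from one inflection table and applying it to unseen words can be illustrated with a deliberately simplified sketch. It is not the method of the paper: it assumes purely suffixing morphology, treats the longest common prefix of the forms as the stem, and uses plain string matching instead of a finite-state transducer. The function names `learn_paradigm` and `analyze`, the tag labels, and the English example table are all hypothetical.

```python
import os


def learn_paradigm(table):
    """Learn a suffix paradigm from one inflection table.

    `table` maps a morphosyntactic tag to an inflected form.
    Assumption: the stem is the longest common prefix of all forms,
    so only suffixing morphology is covered.
    """
    stem = os.path.commonprefix(list(table.values()))
    return {tag: form[len(stem):] for tag, form in table.items()}


def analyze(word, paradigm, lemma_tag):
    """Return candidate (lemma, tag) pairs for a possibly unseen word.

    Every suffix of the paradigm that matches the end of `word`
    yields one candidate; the lemma is rebuilt from the remaining
    stem plus the suffix of the lemma form.
    """
    analyses = []
    for tag, suffix in paradigm.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:len(word) - len(suffix)]
            analyses.append((stem + paradigm[lemma_tag], tag))
    return analyses


# Hypothetical example table, in the style of a Wiktionary entry.
walk = {"inf": "walk", "3sg": "walks", "past": "walked", "prog": "walking"}
paradigm = learn_paradigm(walk)

# "jumped" was never seen; the learned paradigm still proposes analyses.
candidates = analyze("jumped", paradigm, lemma_tag="inf")
```

Because the empty suffix of the infinitive matches any word, `candidates` contains a spurious infinitive reading next to the intended past-tense one. Ambiguity of this kind is one reason a real analyzer needs further constraints or weights on top of the raw paradigm.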
Similar resources
Unlimited vocabulary speech recognition for agglutinative languages
It is practically impossible to build a word-based lexicon for speech recognition in agglutinative languages that would cover all the relevant words. The problem is that words are generally built by concatenating several prefixes and suffixes to the word roots. Together with compounding and inflections this leads to millions of different, but still frequent word forms. Due to inflections, ambig...
Learning Transducer Models for Morphological Analysis from Example Inflections
In this paper, we present a method to convert morphological inflection tables into unweighted and weighted finite transducers that perform parsing and generation. These transducers model the inflectional behavior of morphological paradigms induced from examples and can map inflected forms of previously unseen word forms into their lemmas and give morphosyntactic descriptions of them. The system...
Enhancing Morphological Analyzers by Unknown Word Decomposition
This paper describes an approach to integrating the decomposition of non-lexicalized word compounds and derivations into the morphological analyzers of a company's NLP product line. The component employs word formation rules and filtering techniques to decompose words that are not contained in the underlying dictionary database, thereby increasing the average word recognition rate of the mo...
Assessment Criteria for Benchmarking Arabic Morphological Analyzers and Generators
Natural language processing applications are built on a morphological component, so they should meet certain criteria in order to provide the required functionality. Assessing and evaluating Arabic morphological systems depends on the input words and the resulting output, judged against predefined criteria that measure and analyze a given system in order to study its weaknesses and strengths, trying to find an Ara...
Enhanced Search with Wildcards and Morphological Inflections in the Google Books Ngram Viewer
We present a new version of the Google Books Ngram Viewer, which plots the frequency of words and phrases over the last five centuries; its data encompasses 6% of the world’s published books. The new Viewer adds three features for more powerful search: wildcards, morphological inflections, and capitalization. These additions allow the discovery of patterns that were previously difficult to find...
Publication date: 2016